78 Examples, Massive Gains: LIMI Turns Tiny Datasets into Powerful Software Agents
LIMI uses 78 curated, tool-grounded trajectories to fine-tune GLM models, hitting 73.5% on AgencyBench and outperforming large-sample baselines by a wide margin.
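For orientation, here is a minimal sketch of the small-data supervised fine-tuning recipe the summary describes: a causal LM trained on a handful of curated agent trajectories. The model name, dataset format, and hyperparameters below are illustrative assumptions, not LIMI's published configuration.

```python
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "gpt2"  # stand-in; LIMI fine-tunes much larger GLM-family models

class TrajectoryDataset(Dataset):
    """Curated (prompt, tool-grounded trajectory) pairs as token ids."""
    def __init__(self, pairs, tokenizer, max_len=1024):
        self.items = [
            tokenizer(prompt + trajectory, truncation=True,
                      max_length=max_len,
                      return_tensors="pt")["input_ids"].squeeze(0)
            for prompt, trajectory in pairs
        ]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, i):
        ids = self.items[i]
        # Standard causal-LM objective: the model shifts labels
        # internally, so we pass a copy of the input ids.
        return {"input_ids": ids, "labels": ids.clone()}

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

# LIMI's point is that `pairs` can be tiny (78 trajectories) if each
# one is carefully curated; these two entries are placeholders.
pairs = [
    ("Task: fix the failing unit test.\n", "Thought: run pytest ..."),
    ("Task: add pagination to the API.\n", "Thought: inspect routes ..."),
]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-out", num_train_epochs=3,
                           per_device_train_batch_size=1,
                           learning_rate=1e-5, report_to="none"),
    train_dataset=TrajectoryDataset(pairs, tokenizer),
)
trainer.train()
```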
Apriel-1.5-15B-Thinker is a 15B open-weights multimodal reasoning model that scores 52 on the Artificial Analysis Intelligence (AAI) index and fits on a single GPU, offering reproducible training artifacts and competitive benchmark results at low cost.
MIT researchers show that on-policy reinforcement learning preserves prior capabilities better than supervised fine-tuning by minimizing the forward KL divergence between the base and fine-tuned models.
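The quantity involved is easy to state concretely. Below is a hedged diagnostic sketch of forward KL between a base and a fine-tuned model's next-token distributions, assuming access to both models' logits on the same inputs; the tensor shapes and toy data are illustrative, and the paper's exact estimator and evaluation distribution may differ.

```python
import torch
import torch.nn.functional as F

def forward_kl(base_logits: torch.Tensor, ft_logits: torch.Tensor) -> torch.Tensor:
    """KL(p_base || p_ft), averaged over sequence positions.

    base_logits, ft_logits: (seq_len, vocab) next-token logits computed
    on the same inputs. Forward KL heavily penalises tokens the base
    model assigns mass to but the fine-tuned model no longer covers,
    which is why a small value indicates preserved prior capabilities.
    """
    log_p = F.log_softmax(base_logits, dim=-1)
    log_q = F.log_softmax(ft_logits, dim=-1)
    return (log_p.exp() * (log_p - log_q)).sum(-1).mean()

# Toy usage: random logits stand in for the two models' outputs.
base = torch.randn(16, 32000)
ft = base + 0.1 * torch.randn(16, 32000)  # a mildly drifted "fine-tune"
print(float(forward_kl(base, ft)))
```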
Prefix-RFT blends supervised and reinforcement fine-tuning by using partial demonstration prefixes to guide exploration, achieving stronger and more stable performance on math reasoning benchmarks than SFT, RFT, and hybrid baselines.
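As a rough illustration of the prefix idea, the sketch below keeps a random fraction of a demonstration as a fixed prefix and lets the policy generate the remainder; the function names and prefix-length schedule are assumptions, not the paper's exact algorithm. The sampled continuation would then be scored with the task reward (e.g., answer correctness on a math problem) and optimized with a standard policy-gradient update, while the demonstration prefix anchors exploration to good territory.

```python
import random

def prefix_guided_rollout(policy_sample, demo_tokens, max_frac=0.8):
    """Keep a random fraction of the demonstration as a fixed prefix,
    then let the policy explore the continuation on its own."""
    cut = int(len(demo_tokens) * random.uniform(0.0, max_frac))
    prefix = demo_tokens[:cut]
    continuation = policy_sample(prefix)  # model completes from the prefix
    return prefix, continuation

# Toy usage: a stand-in generator in place of an actual language model.
def toy_policy_sample(prefix):
    return prefix + ["<model-generated steps>", "<answer>"]

demo = ["read problem", "set up equation", "solve", "42"]
print(prefix_guided_rollout(toy_policy_sample, demo))
```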
OpenThoughts introduces a scalable supervised fine-tuning data pipeline that produces substantially stronger reasoning datasets and models, achieving state-of-the-art performance in math, coding, and science domains.
Yandex has launched Alchemist, a compact supervised fine-tuning dataset built by model-guided selection of high-impact image-text pairs, which significantly improves text-to-image model quality.